Abstract



A group membership protocol ensures agreement and consistent commit actions among group 
members to maintain a sequence of identical group views in spite of continuous changes, 
either voluntary or otherwise, in processors’ membership status. In asynchronous distributed 
environments, such consistency among group views must be guaranteed using messages over 
a network which does not bound message delivery times. Assuming a network that provides 
a reliable, FIFO channel between any pair of processors, one approach to designing such a 
protocol is to centralize the responsibility to detect changes, ensure agreement, and commit 
them consistently in a single manager process. This approach is complicated by the fact that 
a protocol to elect a new manager with a consistent membership proposal must be executed 
when the manager itself fails. In this report, we present a membership protocol based on 
ordering of group members in a logical ring that eliminates the need for such centralized 
responsibility. Agreement and commit actions are token-based and the protocol ensures 
that no tokens are lost or duplicated due to changes in membership. The cost of committing 
a change is 2n point-to-point messages over FIFO channels where n is the group size. The
protocol correctness has been proven formally. 



1 Introduction 



A consistent view of the membership of a group of entities that cooperate to perform a task is
basic to the construction of distributed applications using the process group approach [BSS91].
The group of entities may correspond to a set of processes that must behave consistently 
to provide a service or a set of processors that must determine their function based on 
which other processors are operational. Changes to the membership occur when members 
fail or leave the group and when they recover or join the group. Some form of consensus 
on group membership is necessary, for without it, a server that respects its specification 
may nonetheless behave inconsistently with respect to another server since they see different 
group members. The group membership problem refers to achieving such consensus. A
protocol that solves it is called a group membership protocol (GMP). The absence of shared memory in a
distributed system requires a GMP to rely on message passing alone.

Typically, availability of a GMP supports construction of reliable communication primitives 
which in turn simplify construction of distributed applications. For example, guarantees 
about multicast communication in the presence of failures require an underlying GMP [B+90,
CM84]. Aside from the basic requirements of safety and liveness, a GMP can be evaluated
in terms of how well it supports the required communication primitives. Prompt response
to membership changes and ability to support changes continuously (i.e., without stalling
the application) are two of the desirable performance features of a GMP.

The design and complexity of a GMP depend critically on whether it operates in a synchronous
or asynchronous distributed system. In the former, a GMP exploits tight synchronization
among the clocks of the interacting processes and/or known upper bounds on
message delivery times. It is possible for all application messages to wait till changes to 
membership are complete and for all membership changes to wait till all pending messages 
are sent. Examples of such GMPs are [Cri88, EdL90, KGRS9]. 

In asynchronous systems, there is no relationship among clocks of the interacting processes 
and message delivery times are unbounded. Therefore, crashes are indistinguishable from 
communication delays or slow members. It is only possible to perceive failures. It is necessary 
that members perceived to have failed be removed from the group since it is impossible to
reach consensus on a failure [FLP85]. In this report, we deal with GMPs for asynchronous
systems only. The basic function of a GMP in an asynchronous system is to ensure that 
all operational members commit perceived changes to their local views consistently. The 
consistent commit entails agreement about the change perceived. 

Several GMPs have been proposed for asynchronous systems. In [Bru85], failure/recovery 
detection and notification are achieved using successive message rounds. Maintaining consistent
views is the responsibility of higher level software. The number of messages required
scales nonlinearly with the number of members and the recovery protocol requires a priori
knowledge of the potential members. Several GMPs are proposed in [LSA91] based on total
ordering of messages. Such ordering has a high overhead cost and assumes a fault-tolerant,
reliable broadcast communication protocol. In [CM84], reliable broadcasts are supported 
by rotating a membership list (token-list) among operational members. When a member 
holding the token list fails, a reformation phase is entered which guarantees that a single 
new token-list is generated and committed to by all members. During this phase, normal 
message traffic is suspended and handling of changes needs an extension to the protocol. 

In [BJ87], a two-phase site-view management protocol is proposed to support higher level
fault-tolerant communication primitives. Its drawback of blocking during continuous failures 
and recoveries is removed in the formal solution proposed in [RB91]. Assuming a completely 
connected network of reliable FIFO channels and fail-stop behavior of member processes, 
this GMP uses a two-phase algorithm for the basic membership update and a three-phase 
algorithm when the reconfiguration manager itself fails. Election of a new manager with a 
consistent membership proposal must avoid invisible commits. 

In this report, we describe a GMP for asynchronous systems to support the reliable communication
primitives required for the virtually synchronous process group approach of [BSS91]. All
application level communication between members of a group is assumed to carry a view 
number. It is required that each increment of the view number be associated with successive 
views that differ in only one member. Using a fully connected network of reliable FIFO 
channels, the proposed GMP guarantees that a given view number is associated with the 
same membership at any operational member. 

The proposed GMP eliminates the need for centralizing the responsibility of ensuring consistency
of view changes as in [RB91] by maintaining the group view ordered as a logical
ring at each member. Each member perceives the departure of a neighboring member and 
joining members enter on one side of a virtual marker whose position is maintained by all 
the members. Agreement and commit actions are achieved using tokens circulated along 
the logical ring. The protocol is able to regenerate lost tokens and ignore duplicate ones 
generated during its operation. 

This report is organized as follows. In section 2, the terminology used in the description of 
the protocol is established and our assumptions are listed. In section 3, the algorithms used 
in this GMP are described. In section 4, the correctness proof is presented. The report ends 
with concluding remarks in section 5. 



2 GMP Overview 

2.1 Assumptions 



The proposed GMP makes the following assumptions. A reliable FIFO communication 
channel between any two members that are operational is assumed. In other words, it is 
assumed that the network is never partitioned. All failures are assumed to be crash or 
fail-stop [Cri88]. This implies that a message that has been sent can fail to be delivered only because of
the receiver’s failure. However, it may be arbitrarily delayed. Continuous changes to the
membership are allowed; however, the changes are committed one at a time. A member gets 
added when a join request is processed and gets deleted when a departure is perceived. A 
group name is assumed to be public to those processes that may wish to join the group. A 
mechanism, whereby a process wishing to join a group can locate a site already running a 
member of the group it wants to join, is assumed to be available. 



2.2 Overview 



The proposed GMP guarantees that view changes and their sequence at each operational
member are identical. Using a view number in all group-related communication guarantees
that reliable communication primitives can be built. The principal feature of this GMP is
that there is no central element either to detect a change in membership status or to guarantee
consistency of a commit action on the view of group membership. Both are achieved in a
distributed manner using a logical ring which is simply a conceptual circular ordering of the
members.

Figure 1: A logical ring

A logical ring has no relation with the physical locations of the members. Given such a ring 
and a direction of traversing it (arbitrarily, clockwise is selected), each member periodically 
queries its counter-clockwise neighbor for its status. The neighbor then responds with a 
status message when it receives this query. It, in its turn, sends a status query to its 
counter-clockwise neighbor. Thus, every member monitors one other member and is itself 
monitored by a third member. For example, if there are 6 members p_0 to p_5, a logical ring
can be configured in which p_0 is the counter-clockwise neighbor of p_1 and the clockwise neighbor
of p_5, p_1 is the counter-clockwise neighbor of p_2 and the clockwise neighbor of p_0, and so on. p_1
sends a status query to p_0 and p_0 responds with a status message to p_1. The status message
from p_0 is monitored by p_1. This is illustrated in Fig. 1.
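
The monitoring arrangement in this example can be sketched in a few lines of Python. This is only an illustration: the list representation of the ring and the helper names cw_neighbor and ccw_neighbor are assumptions made for the sketch and are not part of the protocol description.

```python
def cw_neighbor(ring, member):
    """Return the member that follows `member` in clockwise order."""
    i = ring.index(member)
    return ring[(i + 1) % len(ring)]


def ccw_neighbor(ring, member):
    """Return the member that precedes `member` in clockwise order."""
    i = ring.index(member)
    return ring[(i - 1) % len(ring)]


# Hypothetical six-member ring, listed in clockwise order.
ring = ["p0", "p1", "p2", "p3", "p4", "p5"]

# Each member sends status queries to (monitors) its counter-clockwise neighbor.
monitors = {m: ccw_neighbor(ring, m) for m in ring}
```

In this sketch p1 monitors p0 and p0 monitors p5, so every member monitors exactly one member and is monitored by exactly one other.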

Initially, the ring configuration is known to all the members. As members change status,
the ring configuration changes. The GMP treats the cases of a member leaving the group in
the same manner as a member joining the group (footnote 1). The protocol maintains appropriate
information at operational members to determine whom each member must monitor. When a 
member departs voluntarily, it simply stops responding to the status query from its monitor. 
If a failure occurs, it is unable to respond to its monitor. In either case, if a monitor does not 
receive a status message within a certain time interval after sending a query, the monitored 
member is perceived to have left the group. A sequence of actions to ensure that all the 
operational members consistently commit to this change is then invoked. When a member 
recovers or wishes to join anew, it sends a join request to the first group member it can 
locate. This member registers the request and invokes a sequence of actions, similar to that 
of departure processing, to ensure that consistent integration of the incoming member takes 
place. 



2.2.1 Processing of Individual Changes 

There are two phases in the protocol to process a join or a departure, viz., the agreement 
phase and the commit phase. These phases are token-based and guarantee that each token 
is processed exactly once by each member and is never lost. Processing of individual view 
changes is described below. More detailed description of the actions taken in each phase is 
given in the next section. 

Departure Processing: 

Once a member perceives the departure of its monitored member because it does not receive 
a status message in response to its query for a predetermined time interval, it initiates 
the agreement phase by sending an agreement token to its clockwise neighbor. It also 
starts monitoring the counter-clockwise neighbor of the member perceived to have departed. 
The agreement token is passed around the ring in the clockwise direction by each member 
passing it on to its clockwise neighbor. When this token circulates back to the agreement 
initiator, it has gone completely around the ring once and all the operational members have 
information indicating that the group has reached an agreement on the departure perceived. 

Footnote 1: Failures amount to a member leaving involuntarily and recoveries amount to a member joining as a new
one.

The agreement initiator then starts the commit phase by generating a commit token which is
circulated around the ring in the same manner as in the agreement phase. All the members 
receiving this token commit the change by removing the departed member from their group 
view and updating the view number. 
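
The two token rounds of departure processing can be illustrated with a small simulation. The sketch below is not the protocol itself: it assumes that no further membership change occurs while the two tokens circulate, and the names circulate and process_departure, as well as the list representation of the ring, are hypothetical.

```python
def circulate(ring, initiator):
    """Pass one token clockwise from `initiator` until it returns to it.

    Returns the list of (sender, receiver) point-to-point messages.
    """
    hops = []
    holder = initiator
    while True:
        nxt = ring[(ring.index(holder) + 1) % len(ring)]
        hops.append((holder, nxt))
        holder = nxt
        if holder == initiator:
            return hops


def process_departure(ring, departed):
    """Run the agreement round and then the commit round for one departure."""
    # The monitor of a member is its clockwise neighbor.
    monitor = ring[(ring.index(departed) + 1) % len(ring)]
    survivors = [m for m in ring if m != departed]
    agree_hops = circulate(survivors, monitor)   # agreement phase
    commit_hops = circulate(survivors, monitor)  # commit phase
    return survivors, agree_hops + commit_hops


ring = ["p0", "p1", "p2", "p3", "p4", "p5"]
survivors, messages = process_departure(ring, "p2")
```

In this run each token makes one hop per operational member, so the two phases together use twice as many messages as there are members on the ring being traversed.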

Join Processing: 

The protocol maintains a logical marker in the ring as the position between some pair of
adjacent operational members at initialization. The clockwise member of this pair is designated
as the host of the logical ring and is known to all members initially. As shown in Fig.
1, a new member always enters the group as the counter-clockwise neighbor of the host, which
has the responsibility of carrying out the agreement and commit phases for the new member.
A member that receives a join request from a potential member registers the request and
sends it clockwise along the ring. When the request reaches the host, the host takes on the responsibility
of carrying out the agreement and commit phases of the join in a manner similar to the departure
processing. It makes the incoming member its monitored neighbor and delivers its local
membership view, view number, and other related information to it.

Both departure and join processing must deal with the possibility of changes to membership
during the agreement and commit phases. These are explained using the following definitions.



2.3 Definitions 

Each member maintains a set containing all the operational members corresponding to its 
current group view. In addition, each member maintains a status table which stores the 
perceived state of all the members that are in the process of departing or joining. This table 
is used by a member to reject any duplicate tokens generated due to the departure of a 
member in the ring in the middle of any phase. There is a pool of all the tokens received 
by a member wherein all the tokens transferred to the neighbor are stored until removed 
by the update policy described later. This pool is maintained in the order of receipt and 
is managed so that no token is lost upon the failure of a member. Using the current group 
view and the status table, each member determines the member it must monitor. 

Group Membership Problem: Every member, p_i, associates an integer, vn, with its current
group view, denoted by the set GV_vn(p_i), and increments it by one for every view
change committed. Solution of the group membership problem requires that

$$\forall\, p_j \in GV_{vn}(p_i) \ \text{and}\ \forall\, n < vn, \quad GV_n(p_j) = GV_n(p_i)$$

A GMP is safe if it guarantees the above. In the following, unless necessitated by the 
context, the view number will be dropped as a subscript. 
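
As an informal illustration of the safety condition, the following Python sketch checks recorded view histories for agreement. It simplifies the condition to "any two members that have recorded a view with the same view number recorded the same membership"; the function name is_safe and the dictionary layout of the histories are assumptions made for this example.

```python
def is_safe(histories):
    """Check that every view number maps to one membership across all members.

    histories: dict mapping a member name to a dict {view number: frozenset
    of member names}, i.e. the sequence of views that member has committed.
    """
    agreed = {}
    for views in histories.values():
        for vn, view in views.items():
            # The first view seen for vn is the reference; all others must match.
            if agreed.setdefault(vn, view) != view:
                return False
    return True


consistent = {
    "p0": {0: frozenset({"p0", "p1", "p2"}), 1: frozenset({"p0", "p1"})},
    "p1": {0: frozenset({"p0", "p1", "p2"}), 1: frozenset({"p0", "p1"})},
}
inconsistent = {
    "p0": {1: frozenset({"p0", "p1"})},
    "p1": {1: frozenset({"p0", "p1", "p2"})},
}
```

Here is_safe(consistent) holds while is_safe(inconsistent) does not, because view number 1 is associated with two different memberships.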

Logical Ring: Assume a set of members, GV = {p_0, p_1, p_2, ..., p_{n-1}}. A circular sequence
of these members regardless of their physical interconnection is called a logical ring.

Members along the ring can be visited by traversing it either clockwise or counter-clockwise.
Given such a ring, a direction of traversing it, and a member, say p_i, a
relation between members gets defined by visiting each remaining member once along
the ring, in order, and returning to p_i from the last member visited.

Ring Relation (RR): Given two members, $p_j, p_k \in GV$, $p_j \xrightarrow{p_i} p_k$ (read as p_j is followed
by p_k with respect to p_i) if p_k is visited after p_j when starting from p_i.

Clearly, given a ring and a direction of traversal, such a relation can be defined with 
respect to every member in GV. On the other hand, given the above ring relation for 
any p_i, the logical ring has a ring property.

Ring Property: 

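$$\forall\, p_i, p_j, p_k \in GV: \ \text{if } p_j \xrightarrow{p_i} p_k, \ \text{then } p_k \xrightarrow{p_j} p_i \ \text{and}\ p_i \xrightarrow{p_k} p_j$$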
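
The ring relation and the ring property can be checked mechanically on a small example. The sketch below assumes the ring is given as a Python list in clockwise order and that traversal is clockwise; the function names followed_by and has_ring_property are illustrative only.

```python
from itertools import permutations


def followed_by(ring, p_i, p_j, p_k):
    """True if p_k is visited after p_j when traversing clockwise from p_i."""
    n = len(ring)
    start = ring.index(p_i)

    def distance(member):
        # Number of clockwise steps from p_i to `member`.
        return (ring.index(member) - start) % n

    return distance(p_j) < distance(p_k)


def has_ring_property(ring):
    """Check the ring property for every ordered triple of distinct members."""
    return all(
        followed_by(ring, p_j, p_k, p_i) and followed_by(ring, p_k, p_i, p_j)
        for p_i, p_j, p_k in permutations(ring, 3)
        if followed_by(ring, p_i, p_j, p_k)
    )


ring = ["p0", "p1", "p2", "p3", "p4"]
```

For this ring, p1 is followed by p3 with respect to p0, and has_ring_property(ring) confirms that the implication holds for all triples.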

Every member orders its own group view as a logical ring with the above property. For
a logical ring, a hypothetical marker fixed along the ring is defined.

Logical Marker: A logical marker is a fixed imaginary position between some pair of
members along a logical ring.

Its adjacent members may change due to departures and joins.

Ring Host: p_host is the first operational member clockwise from the logical marker.

Every member p_i keeps track of the position of the logical marker by ordering GV(p_i)
as a logical ring with respect to p_host.

Rank: rank_{p_i}(p_j) of any $p_j \in GV(p_i)$ is defined as the number of members between p_host
and itself, with rank_{p_i}(p_host) defined to be 0.
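
A rank computation might look as follows. This sketch takes the rank to be the clockwise distance from the host, which is one reading of the definition above, and it models a join by inserting the new member as the counter-clockwise neighbor of the host. The function name rank and the list representation are assumptions for illustration.

```python
def rank(ring, host, member):
    """Clockwise distance of `member` from `host`; the host itself has rank 0."""
    return (ring.index(member) - ring.index(host)) % len(ring)


ring = ["p0", "p1", "p2", "p3"]  # clockwise order
host = "p0"
ranks = {m: rank(ring, host, m) for m in ring}

# A joining member enters as the counter-clockwise neighbor of the host,
# i.e. immediately before the host in clockwise order.
h = ring.index(host)
ring_after_join = ring[:h] + ["p_new"] + ring[h:]
rank_of_new = rank(ring_after_join, host, "p_new")
```

Under these assumptions the new member receives the highest rank in the ring, and the ranks of the existing members are unchanged by the join.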

Monitoring Member: Every p_i maintains p_mon(i) as the last member to query it for its
health.



2.3.1 Tokens 

The proposed GMP is based on the circulation of three types of tokens to achieve agreement and
consistent commit among members. The agreement token initiated at p_i for p_j perceived to
have departed or joined is denoted as agree_{p_i}(p_j). Similarly, the commit token initiated at p_i
for p_j perceived to have departed or joined is denoted as commit_{p_i}(p_j). Every token carries
information about whether it is for a departure or join.

When a join request is received by a member other than the host, the member creates a
join request token, joinreq_{p_i}(p_j), and passes it on to its clockwise neighbor. When the host
receives it, it generates and circulates the agreement and commit tokens for the join. If the
host is the first member to receive the join request, it generates the agreement token directly.

It should be noted that the initiators of the agreement and commit tokens for a given change
need not be identical and also need not be the same as the members that perceived the
changes in the first place. It is possible that p_2 might perceive the failure of its neighbor p_1
and, before initiating the agreement phase, might itself fail. Then its neighbor p_3 would first
initiate agreement processing for p_2 and then initiate agreement for p_1. If p_3 fails before
the agreement phase is complete then its neighbor p_4 would commit the failure of p_1, p_2, and
p_3.



Every member p_i maintains a local status table, denoted as ST_{p_i}. A member has an entry
in this table at p_i only if it has been perceived to have departed but not yet committed
out of GV(p_i) or if it is perceived to have joined but is not yet committed into GV(p_i).
This property is crucial to the safety of the protocol. The five possible values of ST_{p_i}(p_j)
are: DepartureAgreed, JoinAgreed, DeparturePending, JoinRequested, and JoinPending. The
pending status is used to delay the committing of a change at a particular member so that
the order of changes at all the operational members is identical. The rank of a member is
used to determine if this status should be assigned to a member at the time the commit
token for it has been received. The interpretation of these values is summarized in Table 1.

Table 1: Interpretation of ST_{p_i}(p_j)

Status             Interpretation
-----------------  ----------------------------------------------------------
DepartureAgreed    Agreement token for departure of p_j received, but it is
                   not committed; $p_j \in GV(p_i)$ is true
JoinAgreed         Same as above, for a join; $p_j \notin GV(p_i)$ is true
DeparturePending   Commit token for departure of p_j received, but it is
                   not processed; $p_j \in GV(p_i)$ is true
JoinPending        Commit token for join of p_j received, but it is
                   not processed; $p_j \notin GV(p_i)$ is true
JoinRequested      p_i has seen the join request from p_j on its way to
                   the host

Every member p_i maintains a pool of all the tokens it receives, denoted as TknPool(p_i), in
the order they are received. Tokens from this pool are deleted carefully because the receiver 
of a token may depart before receiving it or immediately after receiving it and the token is 
likely to get lost. To prevent such loss, the principle followed in token deletion is to retain 
a token at a member until it is guaranteed that its use is complete. The token pool update 
policy is described in the next section. 



2.3.2 Neighbor and Host Computation 

The following rules determine p_host(p_i), the clockwise neighbor cwnbr(p_i), and the counter-clockwise
neighbor acwnbr(p_i) using the ring relation on GV(p_i) and the status table ST_{p_i}.



Rule to determine a new p_host: $p_{host} = p_j \in GV(p_i)$ such that $\forall\, p_k (\neq p_j) \in
GV(p_i)$, $p_j \xrightarrow{p_{old}} p_k$, where p_old is the old host.

This rule assigns the operational clockwise neighbor of p_old as the new p_host and is
invoked to compute the new host every time a member commits the departure of its
p_host. It should be noted that selection of the new host is determined only by the
current GV(p_i) and not by ST_{p_i}. Since all the group views are consistent, this
ensures that all the members arrive at the same p_host. This rule is applied whenever
the removal of a member is committed.

Rule to determine cwnbr(p_i): The clockwise neighbor is always the member from whom
the status query is received, i.e., cwnbr(p_i) = p_mon.

This rule is applied whenever a status query comes from a member other than the
current cwnbr.

Rule to determine acwnbr(p_i): $acwnbr(p_i) = p_j \in GV(p_i)$ such that $\forall\, p_k (\neq p_j) \in GV(p_i)$,
$p_k \xrightarrow{p_i} p_j$ and $p_j \notin ST_{p_i}$.

This rule is applied whenever a timeout on the arrival of a status report from the current
acwnbr occurs and when there is a departure or join being committed.

Exception: If p_i = p_host and there exists a p_j such that ST_{p_i}(p_j) changes from JoinAgreed to
JoinPending or gets committed, acwnbr(p_i) = p_j. Upon a join, this ensures that p_host
determines the correct member to monitor.
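
The host and counter-clockwise neighbor rules can be sketched as follows. The sketch reads the acwnbr rule as "the nearest member in the counter-clockwise direction that has no entry in the local status table", which is an interpretation and not a statement from the text; the join exception is not modeled. The function names new_host and acwnbr and the data layout are hypothetical.

```python
def new_host(view, old_host):
    """First member clockwise from the old host in the view held just before
    the old host's departure is committed. Only the view is consulted, not
    the status table."""
    i = view.index(old_host)
    return view[(i + 1) % len(view)]


def acwnbr(view, status_table, me):
    """Nearest counter-clockwise member of `me` with no status-table entry.

    view: members in clockwise order; status_table: dict member -> status.
    Returns None if every other member has a status-table entry.
    """
    n = len(view)
    i = view.index(me)
    for step in range(1, n):
        candidate = view[(i - step) % n]
        if candidate not in status_table:
            return candidate
    return None


view = ["p0", "p1", "p2", "p3", "p4"]
status = {"p1": "DepartureAgreed"}  # p1 is perceived to have departed
```

With this status table, p2 skips p1 and monitors p0, which matches the behavior described for departure processing.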



3 The Group Membership Protocol 

In Fig. 2, the interaction of the GMP with the application and the network is shown. 
The network is abstracted as a set of reliable FIFO channels. The application generates the 
requests to join a particular group or requests the current view of a group it is a member 
of. In case a group already exists, the GMP has the ability to obtain the address of the 
nearest site with a member of the requested group running. If no site with the group is 
found running, it starts a new group. 

Generation of a join request results in an instance of the GMP being started on the appli- 
cation site. This instance acquires the membership of the desired group and maintains the 
view information until the member departs from the group. The status change detection, 
agreement phase, and commit phase are described below. 

Figure 2: GMP interaction with the external world

3.1 Status Change Detection 

Figure 3 shows the algorithm each member executes to monitor its anticlockwise neighbor 
and initiate an agreement token if a departure is detected. The Monitor process is triggered 
by the local clock. The clockwise and anticlockwise neighbors are computed according to the 
rules given earlier in every iteration of the while loop. If a status message is not received, 
it shuts off communication with the member perceived to have departed (to prevent receipt 
of an excessively delayed response), updates the local status table, generates and adds an 
agreement token to the local pool of tokens, and sends it to the clockwise neighbor. 

If this member turns out to have already departed, the status reporting instrument shown in 
Fig. 4 ensures that the token will get sent to the next clockwise operational member. When 
a change in the querying member is detected, the token pool gets sent to the new querying 
member in addition to the status response. A member recognizes a change in the querying member
by inspecting p_mon and, if the querier differs, sends its token pool. ReportStatus does not compute the clockwise
neighbor, but simply responds to the sender of the query.

Monitor process at p_i

while (true)
    send status query to acwnbr(p_i);
    wait for T_pad;    /* local timeout interval */
    if (status message not received)
        shut off communication with acwnbr(p_i);
        ST_{p_i}(acwnbr(p_i)) ← DepartureAgreed;
        generate agree_{p_i}(acwnbr(p_i));
        add agree_{p_i}(acwnbr(p_i)) to TknPool(p_i);
        send agree_{p_i}(acwnbr(p_i)) to cwnbr(p_i);
    else
        wait for T_query_period;
    end if;
end while;
end Monitor.



Figure 3: Protocol for Monitoring and Agreement Initiation 



ReportStatus process at p_i

if (querying member ≠ p_mon)
    send TknPool(p_i) to the querying member;
    p_mon = querying member;
end if;
send status to p_mon;
end ReportStatus.



Figure 4: Protocol for Reporting the Status 
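
The ReportStatus steps of Fig. 4 can be written as a small Python sketch. The class name Member, the message tuples, and the "alive" payload are assumptions introduced for illustration; a real implementation would send the messages over the FIFO channels instead of returning them.

```python
class Member:
    """Holds the state that ReportStatus needs: p_mon and the token pool."""

    def __init__(self, name):
        self.name = name
        self.p_mon = None    # last member that queried this member
        self.tkn_pool = []   # tokens in order of arrival

    def report_status(self, querier):
        """Return the (destination, kind, payload) messages to send in
        response to a status query from `querier`."""
        out = []
        if querier != self.p_mon:
            # The querying member changed: hand over the whole token pool
            # so that no token is lost when a neighbor departs.
            out.append((querier, "token_pool", list(self.tkn_pool)))
            self.p_mon = querier
        out.append((self.p_mon, "status", "alive"))
        return out


m = Member("p0")
m.tkn_pool = ["agree(p5)"]
first = m.report_status("p1")    # new querier: pool is sent, then status
second = m.report_status("p1")   # same querier: status only
third = m.report_status("p2")    # querier changed again: pool is resent
```

Note that, as in Fig. 4, the member never computes its clockwise neighbor here; it only reacts to whoever sent the query.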

InitiateJoin for a request message/token for p_new at p_i

while (true)
    receive join request message or joinreq token for p_new
        until (p_new ∉ ST_{p_i});
    if (p_host = p_i)
        generate agree_{p_i}(p_new);
        ST_{p_i}(p_new) ← JoinAgreed;
        add agree_{p_i}(p_new) to TknPool(p_i);
        send agree_{p_i}(p_new) to cwnbr(p_i);
    else
        ST_{p_i}(p_new) ← JoinRequested;
        if (join request)    /* p_new contacts p_i first */
            generate joinreq_{p_i}(p_new) token;
        end if;
        add joinreq token for p_new to TknPool(p_i);
        send joinreq token to cwnbr(p_i);
    end if;
end while;
end InitiateJoin.



Figure 5: Algorithm to initiate a join 

When the application generates a request to join a group, an instance of the GMP gets
spawned. It obtains the address of the nearest site running a member, sends a join
request message to it, and waits for notification of the request's approval for a preset interval
before resending the request. Before the request is resent, the address of the nearest site running a
member is searched for again. The receiving member p_i runs the algorithm specified in Fig. 5.
A non-host member receiving a request message for the first time generates a joinreq_{p_i}(p_new)
token and adds it to the local token pool. It enters status JoinRequested for p_new in its status
table and sends the token to its cwnbr. A duplicate join request is rejected on the basis of
an entry for p_new in the local status table. If the member receiving the request message or
token is the ring host, it generates the agreement token, updates the local status table and
token pool, and sends the token to its cwnbr.

3.2 The Agreement Phase



The algorithm used to process an agreement token is shown in Fig. 6. If the member that 
receives an agreement token for the first time is not its initiator, it must simply pass it on to 
its clockwise neighbor after adding it to its token pool and updating the local status table 
(lines 15-19 of Fig. 6). However, if it is the initiator of the token, it must generate a commit 
token when the token has circulated back to it. The receiver of an agreement token must also generate
a commit token if the initiator departed after generating the agreement token and, as
a result, a duplicate agreement token is received at that member. In this case, the member
generating the commit token will have an entry in its local status table for the initiator of
the token (line 1, Fig. 6).

Any member commits a change to its view when it processes a commit token for the change. 
Thus, the initiator of a commit token commits the corresponding change locally and sends it 
to the clockwise neighbor. There are two aspects to committing a change in the group view 
in this protocol. Firstly, since the ring configuration may lead to the arrival orders of two 
commit tokens to be opposite at two different members along the ring, the changes must be 
committed in a consistent order at all the members. Secondly, when a change is committed, 
it must be ensured that all the protocol-related entities are correctly updated. 

The correct ordering of all changes is based on the rank of the member whose status change 
is being processed. The ordering is imposed at the initiator of the commit token as follows: 
if the rank of the member with the changed status is the lowest among all the members for
which there is an agreement token in the token pool, a commit token is generated. Otherwise,
commit token generation is kept pending until all changes for members with a lower rank
have been committed (lines 5-13, Fig. 6).

Updates of all the protocol-related quantities upon committing a change are encapsulated
as CommitChange, whose steps are shown in Fig. 7. Aside from passing the token on to
the clockwise neighbor, the local membership, view number, status table, and token pool
must be updated. Line 5 determines the token pool update policy that garbage-collects old
commit tokens. The principle followed in this update is that a token should be deleted from
the TknPool only when the member is certain that its use is over. A member keeps its token
pool ordered according to arrival time, inspects all the tokens in it, and deletes all the
commit tokens received before the agreement token for the change committed. The commit
token just processed is not deleted in case the member it is sent to departs before receiving
it.

ProcessAgreementTkn for agree_{p_j}(p_k) at p_i

   /* A commit must be generated either when I am the
      agreement initiator or when a duplicate token is received
      due to departure of the agreement initiator p_j */
1  if ((p_i = p_j) || ((p_j ≠ p_i) && (duplicate token) && (p_j ∈ ST_{p_i})))
2      if (no unprocessed agreement token in TknPool)
3          generate commit_{p_i}(p_k);
4          CommitChange;
5      else
6          compute rank ∀ p_l ∈ ST_{p_i} with Agreed status;
7          if (rank(p_k) is smallest)
8              generate commit_{p_i}(p_k);
9              CommitChange;
10         else
               /* depending upon whether for join or departure of p_k */
11             ST_{p_i}(p_k) ← DeparturePending or JoinPending;
12         end if;
13     end if;
14 else
15     if ((p_j ≠ p_i) && (not a duplicate agree_{p_j}(p_k)))
16         add agree_{p_j}(p_k) to TknPool;
17         ST_{p_i}(p_k) ← DepartureAgreed or JoinAgreed;
18         send agree_{p_j}(p_k) to cwnbr(p_i);
19     end if;
20 end if;
end ProcessAgreementTkn.

Figure 6: Protocol for Agreement Tokens

CommitChange for commit_{p_j}(p_k) at p_i

   /* Depending on whether a join or departure */
1  add or delete p_k from GV(p_i);
2  delete p_k entry from ST_{p_i};
3  vn(p_i) ← vn(p_i) + 1;
4  send commit_{p_j}(p_k) to cwnbr(p_i);
5  delete all commit tokens received before
       agree_{p_j}(p_k) from TknPool(p_i);
6  if join committed delete joinreq_{p_i}(p_k);
7  delete agree_{p_i}(p_k);
8  add commit_{p_j}(p_k) to TknPool(p_i);
9  determine new p_host;
10 if ((join committed) && (p_host = p_i))
11     update acwnbr(p_i);
12     send ST_{p_i}, TknPool(p_i), and GV(p_i) to acwnbr(p_i);
13 end if;
end CommitChange.

Figure 7: Protocol for Committing a Change
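
The token pool update performed while committing a change (lines 5-8 of Fig. 7) can be sketched in Python as follows. Tokens are modeled as (kind, subject) pairs and the initiator subscript is ignored; both simplifications, as well as the function name update_pool, are assumptions made for this example.

```python
def update_pool(pool, subject, is_join):
    """Return the token pool after committing the change for `subject`.

    pool: list of (kind, subject) tuples in order of arrival, where kind is
    "agree", "commit", or "joinreq".
    """
    # Position of the agreement token for the change being committed.
    # If it is absent, no older commit tokens are garbage-collected.
    agree_idx = next(
        (i for i, (kind, s) in enumerate(pool) if kind == "agree" and s == subject),
        0,
    )
    kept = []
    for i, (kind, s) in enumerate(pool):
        if kind == "commit" and i < agree_idx:
            continue  # line 5: commit tokens received before the agreement token
        if is_join and kind == "joinreq" and s == subject:
            continue  # line 6: join request token for a committed join
        if kind == "agree" and s == subject:
            continue  # line 7: the agreement token itself
        kept.append((kind, s))
    kept.append(("commit", subject))  # line 8: retain the commit token just processed
    return kept


pool = [("commit", "p1"), ("agree", "p3"), ("commit", "p2"), ("agree", "p4")]
after_departure = update_pool(pool, "p3", is_join=False)
after_join = update_pool([("joinreq", "p9"), ("agree", "p9")], "p9", is_join=True)
```

In the first call the old commit token for p1 is dropped, while the commit token for p2, which arrived after the agreement token for p3, is kept because its use may not yet be complete.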



If the member committing a join is the host, it updates its anticlockwise neighbor to be the
new member and sends its local state to it (lines 11-12, Fig. 7). It also determines a new
host p_host for the ring (line 9) according to the rule given at the end of section 2.



3.3 The Commit Phase 



The processing of a commit token as it circulates around the ring is shown in Fig. 8. If a 
member is the commit initiator (be., the token has circulated back) or if the commit token 
is received again, it simply exits. This indicates completion of the processing required at 
all members for that particular change. If it is received for the first time at a member, 
appropriate commit action must take place (line 4, Fig. 8). After committing the change 
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ProcessCommitTkn for commit p -(pk ) at pi 

1 if ((pi = pj) || (duplicate)) 

2 exit; 

3 else 

4 CommitChange; 

5 while (3 pi G ST Pi with a higher rank & pending status 

received before agree Pj (pk )) 

6 CommitChange; 

7 end while; 

8 end if; 

end ProcessCommitTkn. 



Figure 8: Protocol to process a commit token 

specified in this token, a change whose commit token generation was kept pending locally
may now be committed and propagated, because it now has the lowest rank. All such
pending changes are then processed (lines 5-7, Fig. 8).



3.4 Ensuring an Identical Sequence of Commits 



As members perceive departures/joins around the ring, they initiate agreement phases
independently. Therefore, in this protocol, it is possible for multiple agreement phases to
proceed simultaneously around the ring, resulting in multiple commit tokens that circulate
around the ring at the same time. Consider two such changes: they divide the ring into two
pieces. Clearly, the order in which the two commits reach the members in these two pieces
will be opposite. An identical order is nevertheless maintained in this situation, as
specified by lines 2-12 of Fig. 6.

When a commit token is to be generated, the initiator first checks whether there are any
unprocessed agreement tokens in the token pool. If there are none, a commit token is
generated and the change is committed (lines 3-4); if there are, the commits resulting from
them must be ordered identically around the ring. In that case, the commit initiator
determines whether the member for which a commit is to be initiated has the smallest rank
among all the members for which there are unprocessed agreement tokens (lines 6-9).
Agreement tokens for joins in the pool do not matter because members always join with the
highest rank.



Recall that the rank of a member is its distance from p_host in the clockwise direction. If
the rank is not the smallest, the local status is marked as pending (line 11) and the change
is committed and propagated at a later time. Thus, use of the rank ensures that all the
members commit in the same order around the ring. Note that the pending status for a
change is marked only at the commit initiator.
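The rank test described above can be illustrated with a short Python sketch. It is only a
sketch under stated assumptions: the names `rank` and `commit_or_defer` are hypothetical,
the group view is assumed to be a list in clockwise order starting at the host (so that
rank equals list index), and only departures are considered, since agreement tokens for
joins do not take part in the comparison.

```python
def rank(member, ring_from_host):
    """Clockwise distance of `member` from the host.

    Assumption: `ring_from_host` lists the group view clockwise,
    beginning with the host, so the host has rank 0.
    """
    return ring_from_host.index(member)


def commit_or_defer(member, unprocessed_departures, ring_from_host):
    """Decide what a commit initiator does for the departure of `member`.

    `unprocessed_departures` holds the members for which the initiator
    has unprocessed agreement tokens (departures only).
    Returns "commit" if `member` has the smallest rank among them,
    otherwise "pending".
    """
    my_rank = rank(member, ring_from_host)
    others = [m for m in unprocessed_departures if m != member]
    if all(my_rank < rank(m, ring_from_host) for m in others):
        return "commit"
    return "pending"


ring = ["p0", "p1", "p2", "p3", "p4"]  # p0 is the host
print(commit_or_defer("p3", {"p1", "p3"}, ring))  # prints pending
print(commit_or_defer("p1", {"p1", "p3"}, ring))  # prints commit
```

In this example the departure of p3 is deferred at its initiator until the departure of p1,
which has the lower rank, has been committed.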



4 Proof of Correctness 



Proposition 1: No tokens are lost if a member updates its TknPool using CommitChange.
Proof: If pi receives commit_pj(pk), it is guaranteed to have received agree_pj(pk) some
time previously because the agreement phase is followed by the commit phase. Obviously,
agree_pj(pk) has circulated completely around the ring. Suppose there exists a
commit_pl(pm) received at pi before agree_pj(pk). Thus, in between the arrivals of
commit_pl(pm) and commit_pj(pk) at pi, there exists a token, viz. agree_pj(pk), that has
circulated around the ring completely. This implies that, due to the FIFO property of
channels, commit_pl(pm) has circulated around the ring completely also, regardless of the
locations of pi, pj, and pl around the ring. Thus, commit_pl(pm) has served its purpose and
can be deleted from the TknPool at pi. Therefore, both agree_pj(pk) and commit_pl(pm) have
completed their use and can be deleted. By adding commit_pj(pk) to the TknPool at pi, its
update is complete. Since this token pool is sent to the cwnbr(pi) according to Report
Status, tokens are never lost. ■

Proposition 2: Exactly one pi determines itself to be p_host.

Proof: CommitChange determines a host only when it commits a departure for the current
p_host. According to the rule for determining the new host, only the local group view is
inspected and the clockwise neighbor of the departed host is determined to be the new
p_host. According to Proposition 1, no tokens are lost. Therefore, the commit token for the
departure of the old host is processed by every member. Since the host had rank 0, which is
always the lowest, every member determines the same member as the new p_host. ■
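The host rule used in this proof (the clockwise neighbor of a departed host becomes the new
host) can be sketched as below. The function name `new_host` and the representation of the
local group view as a clockwise-ordered list are assumptions made for illustration; the
sketch only shows that the rule depends on nothing but the local view, so members with
identical views arrive at the same host.

```python
def new_host(ring, host, departed):
    """Return the host after `departed` leaves.

    Assumptions: `ring` is the local group view in clockwise order
    (still including `departed`), and `host` is the current host.
    """
    if departed != host:
        return host  # the host is unaffected by this departure
    if len(ring) < 2:
        raise ValueError("no member left to become host")
    # The clockwise neighbor of the departed host takes over.
    return ring[(ring.index(host) + 1) % len(ring)]


view = ["p0", "p1", "p2", "p3"]
# Every member that holds the same view computes the same new host.
print({member: new_host(list(view), "p0", "p0") for member in view[1:]})
# prints {'p1': 'p1', 'p2': 'p1', 'p3': 'p1'}
```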

Proposition 3: An agreement phase is always started. 
Proof: In case of a departure perceived by a member, say pi, it may itself depart before
initiating the agreement token or after sending it. In the latter case, the commit phase is
carried out by cwnbr(pi). In the former case, cwnbr(pi) perceives the departure of pi and
initiates an agreement phase. It then attempts to monitor acwnbr(pi), the member whose
agreement phase pi could not initiate. cwnbr(pi) perceives acwnbr(pi) as departed also and
initiates an agreement phase for it. This sequence of events is extended if there is a
string of departures. Therefore, the agreement phase for a departure is always started.

In case of a join, if pi is the host and fails before initiating the agreement phase for a
join, cwnbr(pi) determines itself to be the new host and receives the joinreq token as part
of the TknPool to initiate the agreement phase. Since tokens are never lost, once a join
request has been received by an operational member, an agreement phase for its join is
always started.



Proposition 4: The joining member and p_host behave consistently after the agreement
initiation.

Proof: p_host sends its GV, ST, TknPool, and vn to the joining member p_new. The exception
to the rule to compute the acwnbr ensures that the logical ring is correctly configured with
p_new as the highest rank member. When the member that was acwnbr(p_host) before the join
notices that the querying member is different from its p_mon, it becomes aware of the new
member in the ring and sends its TknPool to it. Therefore, all tokens that are passed to
p_host while the state transfer to p_new is taking place are sent to p_new. This ensures
that p_new behaves consistently with p_host. ■



Theorem 1: The proposed protocol correctly solves the GMP stated as

$\forall p_i \in GV_{vn}(p_j) \text{ and } \forall n < vn,\; GV_n(p_j) = GV_n(p_i)$

given that all members start with the same initial group view ($GV_0$).

Proof: We provide a proof by induction.

Base Case: $\forall p_i, p_j \in GV_0(p_k),\; GV_0(p_i) = GV_0(p_j)$ at system initialization.

Induction Hypothesis: Assume that $\exists k \in N,\ k > 1$, such that
$\forall p_i, p_j \in GV_k(p_j),\; GV_k(p_i) = GV_k(p_j)$.
We now prove that the next change committed by any two members is identical. Consider
any pi, pj ∈ GV_{k+1}(pj). Without loss of generality, let commit_pk(pl) be the next change
to be committed by pj. There are two cases.

Case 1 - pj precedes pi on the clockwise path of the token from pk: It is clear from the
change detection instruments that pj ≠ pl and pi ≠ pl. Therefore, if a change involving pl
is view change (k + 1) committed at pj, either the only agreement token pk has at the time
of initiating commit_pk(pl) is for pl, or pl has the smallest rank among all agreement
tokens in the TknPool at pk. Now, a commit token initiated for pm such that pm ≠ pl cannot
result in view change (k + 1) at pi, because this implies that pm has a lower rank at pi
than pl, whose agreement token will be part of the TknPool at pi. Therefore, the agreement
token for pm would also be part of the TknPool at pk and would have the smallest rank at
the time of initiation of commit_pk(pl). This contradicts the fact that pl had the smallest
rank at pk or was the only agreement token at pk. Therefore, view change (k + 1) committed
at pi is due to commit_pk(pl).

Case 2 - pi precedes pj on the clockwise path of the token from pk: In this case,
commit_pk(pl), which results in view change (k + 1) at pj, must first pass through pi,
since pi precedes pj and tokens circulate in the clockwise direction. This implies that
view change (k + 1) at pi is also due to commit_pk(pl).

Thus, given the induction hypothesis for view change k, we prove that

$\forall p_i, p_j \in GV_{k+1}(p_j),\; GV_{k+1}(p_i) = GV_{k+1}(p_j)$

This completes the proof by induction. ■
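The property established by Theorem 1 can be checked mechanically on recorded view
histories, for instance when testing an implementation. The following Python sketch is an
illustration only: the function name `views_agree` and the encoding of a history as a
mapping from version number to a set of member names are assumptions, and views that a
member has not recorded (for example, views committed before it joined) are simply skipped.

```python
def views_agree(histories):
    """Check that members agree on every numbered group view they share.

    `histories` maps a member name to {version number n: set of members},
    i.e. the views GV_n that this member has committed (assumed encoding).
    For each member pj with latest version vn, every pi in pj's latest
    view must hold the same GV_n as pj for each n < vn that pi recorded.
    """
    for pj, views_j in histories.items():
        vn = max(views_j)
        for pi in views_j[vn]:
            views_i = histories.get(pi, {})
            for n, view in views_j.items():
                if n < vn and n in views_i and views_i[n] != view:
                    return False
    return True


good = {
    "a": {0: {"a", "b", "c"}, 1: {"a", "b"}, 2: {"a", "b", "d"}},
    "b": {0: {"a", "b", "c"}, 1: {"a", "b"}},
    "d": {2: {"a", "b", "d"}},  # d joined at version 2
}
bad = dict(good, b={0: {"a", "b", "c"}, 1: {"b", "c"}})
print(views_agree(good), views_agree(bad))  # prints True False
```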



5 Concluding Remarks 



In this report, a group membership protocol for maintaining the membership information
required by virtually synchronous, process-group-based computation is described. It
tolerates continuous changes to the membership by ordering the members of a group using
the concept of a logical ring. In this protocol, joins and departures are processed
identically. The change detection responsibility is evenly distributed among all the
members, which eliminates any need for centralized responsibility. All commits are ordered
according to the rank of a member as defined by its position in the logical ring, and on
this basis the protocol correctness has been proven.

This protocol does not make any majority-based decisions. Any number of departures can
occur and yet the protocol is able to function. Joins and departures can be interleaved
since they are processed identically. Since there is no centralized responsibility, the
overhead for committing a change is fixed at 2n point-to-point messages, where n is the
group size. No special facilities such as broadcast messages, ordered access, or
synchronized actions are required. The protocol simply exploits the reliable FIFO nature of
the channels among members. The message overhead is lower than that of [RB91], which is the
only other group membership protocol known to the authors that uses a fully connected
network of FIFO channels.
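The 2n figure can be illustrated by counting hops on a ring. The sketch below assumes that
the agreement token and the commit token each make exactly one full circuit of an n-member
ring, and it ignores the change in ring size caused by the committed change itself; the
function names `circulate` and `messages_per_change` are hypothetical.

```python
def circulate(ring, start):
    """Count the point-to-point messages needed for a token to travel
    clockwise from `start` around `ring` and back to `start`."""
    origin = ring.index(start)
    current = origin
    hops = 0
    while True:
        current = (current + 1) % len(ring)  # forward to the clockwise neighbor
        hops += 1
        if current == origin:
            return hops


def messages_per_change(ring, initiator):
    """Assumed cost model: one circuit for the agreement token plus
    one circuit for the commit token."""
    return circulate(ring, initiator) + circulate(ring, initiator)


ring = ["p0", "p1", "p2", "p3", "p4"]
print(messages_per_change(ring, "p2"))  # prints 10, i.e. 2n for n = 5
```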

Currently, this protocol is being implemented on a local area network (Ethernet) of SUN
workstations using the transport layer interface of SunOS (a Unix variant). Objectives of
the current work are to characterize the performance of this protocol in terms of the
latency of committing a change and the number of changes supported per second, and to
provide a comparative evaluation of the impact of this protocol on application level
multicasts. Complete connectivity among members implies that the network is never
partitioned. If the distributed computation built over this protocol spans a wide area
communication network, this assumption must be relaxed. While a correctness proof has been
provided here, the current work is also aimed at providing a rigorous mathematical proof.